\section{Conclusion}
Relation clustering is crucial to many downstream natural language processing applications, including relation extraction, question answering, summarization, and machine translation. Previous approaches, such as those based on distributional similarity or on parallel and comparable corpora, are not sufficient to produce high-precision relation clusters. This work exploits refined temporal hypotheses that are useful for accurately identifying semantically equivalent relation phrases in news streams. We present \sys, a novel algorithm based on a graphical model that effectively employs the proposed temporal hypotheses. The joint model of \sys\ is also flexible enough to incorporate new features and constraints.

We present detailed experiments analyzing the performance of \sys\ as well as many competitive baselines. Additionally, this work introduces a new evaluation metric that focuses on diversified relation clusters, which are more useful but more challenging to generate. The empirical study shows that \sys\ outperforms the baselines in precision by significant margins, especially on the diversified relation clusters, demonstrating the power of the proposed temporal hypotheses and the effectiveness of our joint model. An ablation test shows that the temporal dimension of the news is critical to obtaining high precision. We collect and release an annotated corpus of timestamped news articles in order to spur future research.

%For example, textual entailment~\cite{dagan2009recognizing}, which finds a phrase inferring another phrase, is very related to relation clustering task. It is natural to conjecture that certain temporal hypotheses are useful to textual entailment; relation clusters are used in different tasks in various ways, it is interesting to develop a general interface to improve the performances of different downstream NLP tasks.






